THE PRINCIPLE OF GAME THEORETICAL EQUILIBRIUM

Consider a discrete alphabet $A$ and probability distributions $P, Q, \dots$ over $A$. The set
of all such distributions is denoted $M^1_+(A)$. A distribution is identified by its point
probabilities: $P = (p_i)_{i \in A}$. A measure of complexity is a map which to each pair $(P, Q)$
of distributions assigns a value $\Phi(P, Q) \in [0, \infty]$ such that, for each $P \in M^1_+(A)$, the
minimal value of $\Phi(P, Q)$ with $Q \in M^1_+(A)$ is assumed on the diagonal, i.e. for $Q = P$
and nowhere else unless $\Phi(P, P) = \infty$.

A preparation is any non-empty subset $\mathcal{P} \subseteq M^1_+(A)$. When $\mathcal{P}$ is fixed, a consistent
distribution is a distribution in $\mathcal{P}$. The game $\gamma = \gamma(\Phi, \mathcal{P})$ has $\Phi$ as objective function
and is the two-person zero-sum game between Player I ("Nature"), who can choose
a strategy $P \in \mathcal{P}$, and Player II ("the Physicist"), who can choose any strategy
$Q \in M^1_+(A)$. Player I is a maximizer, Player II a minimizer. Thus $\mathrm{val}_I$ defined by
$\mathrm{val}_I = \sup_{P \in \mathcal{P}} \inf_Q \Phi(P, Q)$ is the Player I-value of the game and, similarly, $\mathrm{val}_{II}$
defined by $\mathrm{val}_{II} = \inf_Q \sup_{P \in \mathcal{P}} \Phi(P, Q)$ is the Player II-value of the game. Here and below, a
variable denoted by $Q$ is understood to vary over all of $M^1_+(A)$.

An optimal Player I-strategy is a $P \in \mathcal{P}$ such that $\mathrm{val}_I = \inf_Q \Phi(P, Q)$ and an optimal
Player II-strategy is a $Q \in M^1_+(A)$ such that $\mathrm{val}_{II} = \sup_{P \in \mathcal{P}} \Phi(P, Q)$. By the general
minimax inequality, $\mathrm{val}_I \le \mathrm{val}_{II}$. The game is in equilibrium if $\mathrm{val}_I = \mathrm{val}_{II} < \infty$.

For further information about the game introduced, see [4]. The attempt to locate 
optimal strategies for the players and to establish equilibrium for suitable preparations 
is taken as a basic principle of statistical physics, the principle of game theoretical 
equilibrium (GTE). 

We introduce $\Phi$-entropy of $P$ as minimal complexity, i.e. as $H(P) = \inf_Q \Phi(P, Q)$. By
assumption, $H(P) = \Phi(P, P)$, thus $\mathrm{val}_I = \sup_{P \in \mathcal{P}} H(P)$, which is the maximum entropy
value, also denoted $\mathrm{MaxEnt} = \mathrm{MaxEnt}(\Phi, \mathcal{P})$. So $\mathrm{val}_I = \mathrm{MaxEnt}$ and we realize that
the GTE-principle leads directly to Jaynes' maximum entropy principle, cf. [5].

Classical Boltzmann-Gibbs-Shannon entropy (BGS-entropy) is obtained as minimal
complexity with respect to the measure $(P, Q) \mapsto \sum_{i \in A} p_i \ln\frac{1}{q_i}$, which has a clear and
convincing interpretation related to coding. Our results go some way to establish reasonable
interpretations also for more general measures of complexity. Regarding the origin of
the above measure of complexity, under the name of inaccuracy, see Kerridge [6].

As we have seen, entropy is generated by complexity. So is divergence (cross entropy,
relative entropy or redundancy), defined as actual minus minimal complexity:
$D(P, Q) = \Phi(P, Q) - H(P)$ when $H(P) < \infty$. In any case, the linking identity
$\Phi(P, Q) = H(P) + D(P, Q)$ holds and $D(P, Q) \ge 0$ with equality if and only if $P = Q$ (for the
measures of complexity we shall consider, it will be clear how to define $D(P, Q)$ when $H(P) = \infty$).
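For the classical measure of complexity, the linking identity can be checked numerically. The following is a minimal sketch, assuming a finite alphabet; the function names and the two example distributions are illustrative choices and not part of the original text.

```python
import math

def complexity(p, q):
    """Phi(P, Q) = sum_i p_i * ln(1 / q_i), with the convention 0 * ln(1/0) = 0."""
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0.0:
            continue              # terms with p_i = 0 contribute nothing
        if q_i == 0.0:
            return math.inf       # an event believed impossible actually occurs
        total += p_i * math.log(1.0 / q_i)
    return total

def entropy(p):
    """H(P) = Phi(P, P): minimal complexity, here the BGS entropy."""
    return complexity(p, p)

def divergence(p, q):
    """D(P, Q) = sum_i p_i * ln(p_i / q_i), computed directly."""
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0.0:
            continue
        if q_i == 0.0:
            return math.inf
        total += p_i * math.log(p_i / q_i)
    return total

# Illustrative distributions on a three-letter alphabet (assumed values).
P = [0.5, 0.3, 0.2]
Q = [0.2, 0.5, 0.3]

# Linking identity: Phi(P, Q) = H(P) + D(P, Q); the gap is zero up to rounding.
gap = complexity(P, Q) - (entropy(P) + divergence(P, Q))
```

Here `divergence` is computed independently of `complexity`, so the vanishing of `gap` is a genuine check of the identity rather than a restatement of the definition.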

ROBUSTNESS, EXPONENTIAL FAMILIES 

A Player II-strategy $Q$ is robust if, for some constant $h < \infty$, the level of robustness,
$\Phi(P, Q) = h$ for all consistent distributions $P$. The set $\mathcal{E} = \mathcal{E}(\Phi, \mathcal{P})$ of all robust
Player II-strategies is the exponential family associated with $\gamma(\Phi, \mathcal{P})$. If a family $\mathcal{N}$
of preparations is considered, the exponential family $\mathcal{E}(\Phi, \mathcal{N})$ associated with $\mathcal{N}$ is the
set of distributions which are robust for all preparations $\mathcal{P} \in \mathcal{N}$.

The following general and simple observation will play a key role in the sequel:

Theorem 1 (robustness lemma). Let the measure of complexity $\Phi$ and the preparation
$\mathcal{P}$ be given. Assume that the distribution $Q^*$ is robust ($Q^* \in \mathcal{E}(\Phi, \mathcal{P})$) and
consistent ($Q^* \in \mathcal{P}$). Then $\gamma(\Phi, \mathcal{P})$ is in equilibrium and has $Q^*$ as the unique
MaxEnt-distribution as well as the unique optimal strategy for Player II.

Proof. Though known from e.g. [4] we present a direct proof. 

Let $h$ be the level of robustness. Then $\Phi(Q^*, Q^*) = h$ and, for $P \in \mathcal{P}$ with $P \ne Q^*$,
$H(P) = \Phi(P, P) < \Phi(P, Q^*) = h$. Thus $Q^*$ is the unique MaxEnt-distribution. For
any $Q \ne Q^*$, $\sup_{P \in \mathcal{P}} \Phi(P, Q) \ge \Phi(Q^*, Q) > \Phi(Q^*, Q^*) = h = \sup_{P \in \mathcal{P}} \Phi(P, Q^*)$ and
equilibrium as well as unique optimality of $Q^*$ for Player II follows. □
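Robustness can be illustrated in the classical case. The sketch below assumes a four-letter alphabet, a single constraint function f(i) = i and the value 0.7 for the multiplier; all of these, as well as the perturbation direction `v`, are hypothetical choices made only for illustration. The Gibbs-type distribution `q_star` is consistent by construction, and the complexity of every distribution with the same mean value, measured against `q_star`, comes out at the same level.

```python
import math

def complexity(p, q):
    """Classical complexity sum_i p_i * ln(1 / q_i); all q_i are positive here."""
    return sum(p_i * math.log(1.0 / q_i) for p_i, q_i in zip(p, q) if p_i > 0.0)

lam = 0.7                      # assumed multiplier
f = [0.0, 1.0, 2.0, 3.0]       # assumed constraint function f(i) = i

# Candidate robust strategy: q_i proportional to exp(-lam * f(i)).
weights = [math.exp(-lam * f_i) for f_i in f]
Z = sum(weights)
q_star = [w / Z for w in weights]
a = sum(q_i * f_i for q_i, f_i in zip(q_star, f))   # mean value defining the preparation

# v sums to 0 and has zero f-mean, so q_star + t*v stays in the same preparation
# as long as all entries remain non-negative.
v = [1.0, -2.0, 1.0, 0.0]

def consistent(t):
    return [q_i + t * v_i for q_i, v_i in zip(q_star, v)]

h = complexity(q_star, q_star)                       # level of robustness
ts = [-0.05, 0.05, 0.1]
levels = [complexity(consistent(t), q_star) for t in ts]
entropies = [complexity(consistent(t), consistent(t)) for t in ts]
```

In this sketch every entry of `levels` agrees with `h`, and every entry of `entropies` lies strictly below `h`, which is what the proof of Theorem 1 asserts for consistent distributions other than the robust one.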

The result connects the exponential family $\mathcal{E}$ with the preparation $\mathcal{P}$. Indeed, if $\mathcal{E}$ and
$\mathcal{P}$ intersect, they only intersect in one distribution which then is the optimal strategy for
both players and, furthermore, the game considered is in equilibrium.

COMPLEXITY AND LINEAR CONSTRAINTS 

We shall apply the principle of GTE - via the robustness lemma - to a wide class of 
complexity functions and associated notions of entropy, always having one and the same 
type of preparations in mind, viz. those given by linear constraints. They are the most 
important preparations for statistical physics and other applications, cf. e.g. Kapur [7]. 

From now on, we consider a fixed finite set $f = (f_\nu)_{1 \le \nu \le k}$ of real-valued functions
defined on $A$. The associated family of natural preparations, denoted $\mathcal{N}$, consists of all
non-empty sets $\mathcal{P}_a$ which are defined as follows, denoting by $\langle \cdot, P \rangle$ mean value w.r.t. $P$:

$\mathcal{P}_a = \{ P \in M^1_+(A) \mid \langle f_\nu, P \rangle = a_\nu \text{ for } 1 \le \nu \le k \}$.   (1)

Here $a = (a_\nu)_{1 \le \nu \le k} \in \mathbb{R}^k$. We assume that no non-trivial linear combination of the
$f_\nu$'s reduces to a constant function. Clearly, $\mathcal{E}(\Phi, \mathcal{N})$, the natural exponential family,
consists of those distributions which are robust for all natural preparations.

We shall select special measures of complexity adapted to a study of the natural
preparations and constructed with the aim to simplify the search for distributions in
$\mathcal{E}(\Phi, \mathcal{N})$. To accomplish this, we consider measures of complexity of the form

$\Phi(P, Q) = \xi_Q(\langle \bar{\kappa}(Q), P \rangle)$   (2)

where, for each $Q \in M^1_+(A)$, $\xi_Q$ is a real function and $\bar{\kappa}$ maps $Q \in M^1_+(A)$ into a function
defined on $A$. We insist that $\langle \bar{\kappa}(Q), P \rangle$ can be obtained by summation based on a function
$\kappa : [0, 1] \to [0, \infty]$, the coding function, via the formula

$\langle \bar{\kappa}(Q), P \rangle = \sum_{i \in A} p_i \kappa(q_i)$.   (3)

This corresponds to the requirement $(\bar{\kappa}(Q))(i) = \kappa(q_i)$; $i \in A$.

Regarding $\xi_Q : [0, \infty] \to [0, \infty]$ and $\kappa : [0, 1] \to [0, \infty]$, we assume that the $\xi_Q$'s are
increasing and concave, that $\kappa$ is decreasing and convex, that $\kappa(1) = 0$, that $\kappa$ is
continuous at $0$ (not just on $]0, 1]$) and, finally, that $\Phi$ defined by (2) is a genuine measure
of complexity. The last requirement will be trivially fulfilled in the concrete cases
we shall consider. The inverse function $\kappa^{-1} : [0, \kappa(0)] \to [0, 1]$ will play a significant
role. We note that this function is continuous, decreasing and convex, as is $\kappa$ (simple
geometric proof).

For the classical example, $\xi_Q$ is the identity map and $\kappa$ the function $q \mapsto \ln\frac{1}{q}$. Then
$\kappa^{-1}$ is the restriction of $x \mapsto \exp(-x)$ to $[0, \infty]$. Entropy generated by this measure of
complexity is standard BGS-entropy.

For the general situation, we note that any $Q$ for which $\bar{\kappa}(Q)$ is a linear combination
of the constant function 1 and the given functions $f_1, \dots, f_k$, i.e. of the form

$\bar{\kappa}(Q) = \lambda_0 + \lambda_1 f_1 + \dots + \lambda_k f_k = \lambda_0 + \lambda \cdot f$   (4)

for certain constants $\lambda_0$ and $\lambda = (\lambda_1, \dots, \lambda_k)$, is a member of $\mathcal{E}(\Phi, \mathcal{N})$. Motivated by
this observation, we fix real constants $\lambda = (\lambda_1, \dots, \lambda_k)$ and ask if there exists a real
constant $\lambda_0$ and a distribution $Q = (q_i)_{i \in A}$ such that (4) holds.

For abbreviation, put $L_i = \lambda \cdot f(i)$. Then (4) amounts to $q_i = \kappa^{-1}(\lambda_0 + L_i)$ for $i \in A$. As
$\kappa^{-1}$ is defined on $[0, \kappa(0)]$, we must have $0 \le \lambda_0 + L_i \le \kappa(0)$ for each $i$. Therefore, the
$L_i$ must be bounded below. Furthermore, from $\sum_i q_i = 1$, we conclude that, for each
$K < \kappa(0)$, there can only be finitely many $i \in A$ with $L_i < K$. Thus we may order
the $L_i$: $L_{i_1} \le L_{i_2} \le \dots$, with this sequence breaking off and having a largest element
if $A$ is finite and with $L_{i_n} \to \kappa(0)$ if $A$ is infinite. Put $L_* = L_{i_1}$ and $L^* = \sup_{i \in A} L_i$
($= \kappa(0)$ if $A$ is infinite). We realize that we must require that $L^* - L_* \le \kappa(0)$ and,
assuming this holds, the set of possible constants $\lambda_0$ is the set $[-L_*, \infty[$ in case $\kappa(0) = \infty$
and the set $[-L_*, \kappa(0) - L^*]$ if $\kappa(0) < \infty$. Consider the function $f$ defined by
$f(x) = \sum_{i \in A} \kappa^{-1}(x + L_i)$ with $x$ ranging over the possible values of $\lambda_0$. What we search for is
a value of $\lambda_0$, necessarily unique, such that $f(\lambda_0) = 1$.

Clearly, $f(-L_*) \ge 1$, since $\kappa^{-1}(0) = 1$. By standard techniques, we see that $f$ is continuous from
the right and if $f(x_0) < \infty$ for some value of $x_0$, then $f$ is continuous at all $x > x_0$.
Furthermore, if $x_n \to \kappa(0)$ and if $f(x_n) < \infty$ for all $n$, then $f(x_n) \to 0$ as $n \to \infty$.

Our analysis shows that $f$ can have at most one point of discontinuity, viz. where it
passes from the value $\infty$ to finite values. Such a discontinuity "normally" does not occur.
Also other anomalies are "normally" excluded. For instance, one may easily construct
examples such that $f$ is constantly equal to $\infty$, but such cases are also excluded as they
are of no practical interest. Thus we maintain that "normally" the function $f$ assumes
finite values larger than 1 as well as values less than 1 and hence the existence of a value
$\lambda_0$ with $f(\lambda_0) = 1$ is assured by continuity.

Summarizing, we can now formulate the main result: 

Theorem 2 (MaxEnt calculus). Let $\lambda = (\lambda_1, \dots, \lambda_k)$ be given real constants. Then,
under "normal" circumstances (cf. the discussion above), the equation

$\sum_{i \in A} \kappa^{-1}(\lambda_0 + \lambda \cdot f(i)) = 1$   (5)

has a solution $\lambda_0$, necessarily unique, and $Q = (q_i)_{i \in A}$ given by

$q_i = \kappa^{-1}(\lambda_0 + \lambda \cdot f(i))$ for $i \in A$   (6)

satisfies (4) and hence belongs to the exponential family $\mathcal{E}(\Phi, \mathcal{N})$. This distribution is
the MaxEnt-distribution for $\mathcal{P}_a$ with $a = (a_1, \dots, a_k)$ given by

$a_\nu = \sum_{i \in A} q_i f_\nu(i)$ for $\nu = 1, \dots, k$   (7)

and, for this value of $a$, $\mathrm{MaxEnt}(\Phi, \mathcal{P}_a) = \xi_Q(\lambda_0 + \lambda \cdot a)$.

The theorem replaces and expands the standard recipe for MaxEnt-calculations. The
main difference is a focus on $\lambda_0$ via (5) rather than on the classical partition function. In
the final section we present a more thorough discussion of the significance of the result.
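The calculus of Theorem 2 can be sketched numerically. The code below is one possible way to do it, not a method prescribed by the text: it assumes a finite alphabet, takes the inverse coding function as a Python callable, and solves (5) by plain bisection on a bracket supplied by the caller. The names `solve_lambda0` and `maxent`, the bracket [0, 50] and the example data are hypothetical. The classical case, where the inverse coding function is x -> exp(-x), serves as a check.

```python
import math

def solve_lambda0(kappa_inv, L, lo, hi, tol=1e-12):
    """Bisection for f(x) = sum_i kappa_inv(x + L_i) = 1 on [lo, hi].

    Assumes f is continuous and decreasing with f(lo) >= 1 >= f(hi)."""
    def f(x):
        return sum(kappa_inv(x + L_i) for L_i in L)
    if not (f(lo) >= 1.0 >= f(hi)):
        raise ValueError("bracket [lo, hi] does not enclose a solution of (5)")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 1.0:
            lo = mid          # f is decreasing: the solution lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)

def maxent(kappa_inv, lam, fs, lo, hi):
    """Return (lambda_0, Q, a) following (5), (6), (7); fs[v][i] holds f_v(i)."""
    n = len(fs[0])
    L = [sum(l_v * f_v[i] for l_v, f_v in zip(lam, fs)) for i in range(n)]
    lam0 = solve_lambda0(kappa_inv, L, lo, hi)
    q = [kappa_inv(lam0 + L_i) for L_i in L]
    a = [sum(q_i * f_v[i] for i, q_i in enumerate(q)) for f_v in fs]
    return lam0, q, a

# Classical example: kappa_inv(x) = exp(-x), one constraint function f(i) = i.
fs = [[0.0, 1.0, 2.0, 3.0]]
lam = [0.7]
lam0, q, a = maxent(lambda x: math.exp(-x), lam, fs, lo=0.0, hi=50.0)

# In the classical case lambda_0 should equal ln Z and MaxEnt = lambda_0 + lam . a.
Z = sum(math.exp(-lam[0] * f_i) for f_i in fs[0])
bgs_entropy = -sum(q_i * math.log(q_i) for q_i in q)
maxent_value = lam0 + lam[0] * a[0]
```

The lower end of the bracket is the value -L_* from the discussion preceding the theorem (here L_* = 0); the upper end is simply taken large enough for the sum to drop below 1.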

Before continuing, we shall limit the type of complexity functions studied by reducing
the number of parameters needed for their definition. Instead of the many functional
parameters appearing in (2), we now suggest a setting with only two functional parameters,
one function $\xi$ called the corrector, to account for all the functions $\xi_Q$ via the
formula $\xi_Q(x) = x + \sum_{i \in A} \xi(q_i)$, and then the already introduced coding function $\kappa$. In
other words, we point to complexity functions of the form

$\Phi(P, Q) = \sum_{i \in A} p_i \kappa(q_i) + \sum_{i \in A} \xi(q_i)$.   (8)

The functions $\kappa$ and $\xi$ are uniquely determined from $\Phi$. The two terms in (8) are
called, respectively, the coding part and the correction. For the classical example, the
coding part is $\sum_i p_i \ln\frac{1}{q_i}$ and the correction vanishes.

COMPLEXITY A LA BREGMAN



We shall now generate a $(\Phi, H, D)$-triple from a simple starting point. The method follows
the idea of Bregman divergences and is referred to as Bregman generation. Another
method, Csiszár generation, was suggested in [4]. In our view, Bregman generation is
by far the most important one for the needs of statistical physics.

Given is a Bregman generator by which we shall understand a strictly concave and
smooth real function $h$ defined on $[0, 1]$ with $h(0) = h(1) = 0$ and $h'(1) = -1$. We take
"smoothness" to mean that $h$ has an analytic extension to $[0, \infty[$. Though less will do
for most investigations, the stronger requirement allows one to consider also the dual
function $\tilde{h}$ defined by

$\tilde{h}(x) = x\, h\!\left(\frac{1}{x}\right)$.   (9)

This function is well-defined and real-valued in $]0, \infty[$. As a final technical assumption,
we assume that the function can be extended by continuity to $[0, \infty]$, allowing for infinite
values at the endpoints. A specific value $h(p)$ is interpreted as the complexity of an event
which is known to occur with probability $p$.

From $h$ we generate two functions, $\phi = \phi(p, q)$ and $d = d(p, q)$:

$\phi(p, q) = h(q) + (p - q) h'(q)$,   (10)
$d(p, q) = h(q) - h(p) + (p - q) h'(q)$.   (11)

A specific value $\phi(p, q)$ is interpreted as the complexity of an event which is believed
to occur with probability $q$ but actually occurs with probability $p$. This is consistent
with the previous interpretation as $\phi(p, p) = h(p)$. The function $d$ simply measures the
difference (divergence) between estimated and true value. We also note that $\phi(p, q)$ and
$d(p, q)$ may assume the value $+\infty$. This happens if and only if both $p > q = 0$ and
$h'(0) = \infty$ hold.

Consider the internal functions, $\Phi = \Phi_h$, $H = H_h$ and $D = D_h$ generated by $\phi$, $h$ and
$d$. By this we mean that:

$\Phi(P, Q) = \sum_{i \in A} \phi(p_i, q_i)$, $\quad H(P) = \sum_{i \in A} h(p_i)$, $\quad D(P, Q) = \sum_{i \in A} d(p_i, q_i)$.   (12)

We refer to $\phi$, $h$ and $d$ as the partial functions, respectively partial complexity, entropy
and divergence. They satisfy a partial version of the linking identity:

$\phi(p, q) = h(p) + d(p, q)$.   (13)

Note that $\Phi = \Phi_h$ is of the special form (8) with coding function $\kappa = \kappa_h$ given by

$\kappa(x) = h'(x) + 1$   (14)

and corrector $\xi = \xi_h$ given by $\xi(x) = h(x) - x(h'(x) + 1)$. Hence the Bregman generator
is decomposed into two terms:

$h(x) = x \kappa(x) + \xi(x)$.   (15)

As $\xi(0) = \xi(1) = 0$ and $\xi'(x) = -x h''(x) - 1$ we find that $\xi \equiv 0$ if and only if we
are in the classical case $h(x) = x \ln(1/x)$. We also see that $\xi(x) \ge -x$ in $[0, 1]$, hence the
correction related to any distribution $Q$ is bounded below by $-1$. The dual function $\tilde{h}$
appears also to be of significance. In particular, $\xi(x) = \tilde{h}'(1/x) - x$, hence

$\Phi(P, Q) = \sum_{i \in A} p_i h'(q_i) + \sum_{i \in A} \tilde{h}'\!\left(\frac{1}{q_i}\right)$.   (16)

The first term in (16) is the coding part minus 1, the second term the correction plus 1.
Partial complexity is given by $\phi(p, q) = p\, h'(q) + \tilde{h}'(1/q)$.
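The relations (10) to (15) can be checked mechanically for a concrete generator. The following sketch assumes the generator h(x) = (x^q - x)/(1 - q) with the illustrative value q = 1.5 (written `Q_T` in the code to keep it apart from point probabilities); the helper name `bregman_parts` and the example distributions are hypothetical.

```python
def bregman_parts(h, dh):
    """Build partial complexity, partial divergence, coding function and corrector
    from a generator h and its derivative dh, following (10), (11), (14)."""
    def phi(p, q):
        return h(q) + (p - q) * dh(q)
    def d(p, q):
        return h(q) - h(p) + (p - q) * dh(q)
    def kappa(x):
        return dh(x) + 1.0
    def xi(x):
        return h(x) - x * (dh(x) + 1.0)
    return phi, d, kappa, xi

Q_T = 1.5   # assumed parameter of the example generator

def h(x):
    return (x ** Q_T - x) / (1.0 - Q_T)

def dh(x):
    return (Q_T * x ** (Q_T - 1.0) - 1.0) / (1.0 - Q_T)

phi, d, kappa, xi = bregman_parts(h, dh)

# Illustrative distributions (assumed values).
P = [0.5, 0.3, 0.2]
Q = [0.2, 0.5, 0.3]

Phi_internal = sum(phi(p_i, q_i) for p_i, q_i in zip(P, Q))                          # (12)
Phi_form8 = sum(p_i * kappa(q_i) for p_i, q_i in zip(P, Q)) + sum(xi(q_i) for q_i in Q)  # (8)
H = sum(h(p_i) for p_i in P)
D = sum(d(p_i, q_i) for p_i, q_i in zip(P, Q))
```

With these definitions `Phi_internal` and `Phi_form8` agree up to rounding, `Phi_internal` equals `H + D`, and `D` is positive because `h` is strictly concave.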



GENERATORS VIA DEFORMED LOGARITHMS 

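We turn to a concrete two-parameter family $(h_{\alpha,\beta})$ of Bregman generators defined via
deformed logarithms (taken in this form from [10]) and given by

$\ln_{\alpha,\beta} x = \frac{x^\alpha - x^\beta}{\alpha - \beta}$ for $\alpha \ne \beta$, $\qquad \ln_{\alpha,\beta} x = x^\alpha \ln x$ for $\alpha = \beta$.   (17)

The associated Bregman generators are defined by

$h_{\alpha,\beta}(x) = x \ln_{\alpha,\beta}(1/x)$.   (18)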

Warning: We have chosen to model the definition after the expression $x \ln(1/x)$ rather
than $-x \ln x$. The main reason is the more natural interpretation of the former expression,
but also, the change appears to be more in line with what is preferred in the "Tsallis
literature". The change is in contrast to the choice in [4]. Thus, compared to [4], one should
make the transformation $(\alpha, \beta) \mapsto (-\beta, -\alpha)$. Note also the symmetry $h_{\alpha,\beta} = h_{\beta,\alpha}$.

From [4] we see (after transformation) that, in order to obtain a genuine Bregman
generator, the following restrictions apply to $\alpha$ and $\beta$: Either $0 \le \alpha < 1$ and $\beta \le 0$ or
else $\alpha \le 0$ and $0 \le \beta < 1$.
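The defining properties of a Bregman generator can be probed numerically for members of the two-parameter family. The sketch below assumes the deformed logarithm in the two-parameter form (x^alpha - x^beta)/(alpha - beta), with x^alpha ln x on the diagonal alpha = beta; the function names and the parameter pair (0.5, -0.3) are illustrative choices. Derivatives are approximated by finite differences, so the checks are indicative rather than exact.

```python
import math

def deformed_log(x, alpha, beta):
    """Two-parameter deformed logarithm (assumed form, see the lead-in)."""
    if alpha == beta:
        return x ** alpha * math.log(x)
    return (x ** alpha - x ** beta) / (alpha - beta)

def generator(x, alpha, beta):
    """h_{alpha,beta}(x) = x * ln_{alpha,beta}(1/x), extended by h(0) = 0."""
    if x == 0.0:
        return 0.0
    return x * deformed_log(1.0 / x, alpha, beta)

alpha, beta = 0.5, -0.3     # illustrative pair with 0 <= alpha < 1 and beta <= 0

# h(1) = 0 and h'(1) = -1, the latter via a central difference.
eps = 1e-6
h_at_one = generator(1.0, alpha, beta)
slope_at_one = (generator(1.0 + eps, alpha, beta) - generator(1.0 - eps, alpha, beta)) / (2.0 * eps)

# Concavity probe: second differences on a grid inside ]0, 1[ should be negative.
step = 0.01
grid = [0.1 * j for j in range(1, 10)]
second_differences = [
    generator(x - step, alpha, beta) - 2.0 * generator(x, alpha, beta) + generator(x + step, alpha, beta)
    for x in grid
]

# Symmetry h_{alpha,beta} = h_{beta,alpha}, and the classical case alpha = beta = 0.
symmetry_gap = max(abs(generator(x, alpha, beta) - generator(x, beta, alpha)) for x in grid)
classical_gap = abs(generator(0.3, 0.0, 0.0) - 0.3 * math.log(1.0 / 0.3))
```

Setting `beta = 0` and `alpha = 1 - q` in `generator` gives the function x -> (x^q - x)/(1 - q), for instance x - x^2 when q = 2.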

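The partial complexity function and the coding function are given, for $\alpha \ne \beta$, by:

$\phi_{\alpha,\beta}(x, y) = \frac{1}{\beta - \alpha} \left( -(1 - \alpha) x y^{-\alpha} + (1 - \beta) x y^{-\beta} - \alpha y^{1-\alpha} + \beta y^{1-\beta} \right)$,   (19)
$\kappa_{\alpha,\beta}(x) = 1 + \frac{1}{\beta - \alpha} \left( -(1 - \alpha) x^{-\alpha} + (1 - \beta) x^{-\beta} \right)$.   (20)

Note that $\kappa(0) = \infty$ except if either $\alpha = 0$ or $\beta = 0$ (then $\kappa(0) = (\alpha + \beta - 1)/(\alpha + \beta)$).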

The important inverse functions $\kappa_{\alpha,\beta}^{-1}$ are defined on $[0, \kappa(0)]$. They can only be
calculated in closed form in special cases. We point to the Tsallis case which corresponds
to $\alpha < 1$, $\beta = 0$. The Tsallis parameter, traditionally denoted by $q$, is then given by
$q = 1 - \alpha$. For the origin of this family within the physics literature, see Tsallis, [11].
Let us put $\kappa_{\alpha,0} = \kappa_q$ (as above with $q = 1 - \alpha$). Then, for $q \ne 1$,

$\kappa_q^{-1}(x) = \left( 1 + \frac{1 - q}{q}\, x \right)^{\frac{1}{q-1}}$ for $0 \le x \le \kappa_q(0)$   (21)
and one can insert (21) into (5). The kind of sums obtained will, typically, have to be
calculated numerically. An exception is the case $q = 2$. We leave it to the reader to work
out the pleasant details of our calculus in this case (take $A$ to be finite).
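As a hint for the case q = 2 with A finite, here is a small sketch under stated assumptions. For q = 2 the inverse coding function (21) becomes x -> 1 - x/2 on [0, 2], so with n letters equation (5) is linear in lambda_0 and gives lambda_0 = 2(n - 1)/n minus the average of the L_i, provided all lambda_0 + L_i stay within [0, 2]. The alphabet size, the constraint function f(i) = i and the multiplier 0.2 are illustrative choices. For the generator h(x) = x - x^2 the corrector works out to x^2 - x, which is used in the last lines.

```python
def kappa_inv_tsallis(x, q):
    """Inverse coding function (21), for 0 <= x <= kappa_q(0) and q != 1."""
    return (1.0 + (1.0 - q) / q * x) ** (1.0 / (q - 1.0))

f = [0.0, 1.0, 2.0, 3.0]      # assumed constraint function f(i) = i
lam = 0.2                     # assumed multiplier
L = [lam * f_i for f_i in f]
n = len(L)

# Closed-form solution of (5) for q = 2.
lam0 = 2.0 * (n - 1) / n - sum(L) / n

# The distribution from (6), once via the linear formula and once via (21).
q_dist = [1.0 - (lam0 + L_i) / 2.0 for L_i in L]
q_check = [kappa_inv_tsallis(lam0 + L_i, 2.0) for L_i in L]

# Mean value (7), Tsallis entropy for q = 2, and the MaxEnt value of Theorem 2,
# i.e. lambda_0 + lam * a plus the correction sum_i (q_i^2 - q_i).
a = sum(q_i * f_i for q_i, f_i in zip(q_dist, f))
tsallis_entropy = 1.0 - sum(q_i ** 2 for q_i in q_dist)
correction = sum(q_i ** 2 - q_i for q_i in q_dist)
maxent_value = lam0 + lam * a + correction
```

With the values chosen here the distribution comes out as (0.4, 0.3, 0.2, 0.1), and `maxent_value` coincides with `tsallis_entropy` up to rounding.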

Another case where $\kappa_{\alpha,\beta}^{-1}$ can be calculated in closed form is the Kaniadakis family
which corresponds to $\alpha = -\beta$, cf. Kaniadakis [12]. We shall not go into that here.

DISCUSSION 

Some features of the main result. Theorem 2 provides a theoretical framework for 
MaxEnt calculations for natural preparations given by linear constraints and pertaining 
to a wide range of different entropy measures. Among special features as compared with 
the standard approach we mention the following: 

The basis for the result is the game theoretical approach which necessitates a focus 
on possibly unfamiliar aspects and quantities, notably a focus on a notion of complexity, 
intended to reflect the interplay between the physicist and the system he is studying. 
This aspect could have been hidden, but the underlying principle - the principle of Game 
Theoretical Equilibrium - is in itself promoted as a major issue. Indeed, it is suggested 
that this principle is of a basic nature, applicable to several scientific investigations, and 
that, for the area of statistical physics, it is more fundamental than Jaynes' Maximum
Entropy Principle. The principle originated with Pfaffelhuber [13] and, independently,
the author (with [14] the first publication in English). Among further studies, we mention
the joint work [15] with Harremoës.

Another feature is the puzzling fact that optimization has been achieved "miracu- 
lously" without recourse to Lagrange multipliers. Many will find it difficult to accept 
that for the problem studied, an approach which is better - simpler and more illuminat- 
ing - than the well proven technique involving the popular multipliers exists. Within the 
mathematical literature, this special feature goes back at least to Csiszár, cf. [23].

Finally, we note that the MaxEnt calculus outlined here has no mention of partition
functions. The calculus goes a good deal beyond traditional settings based on classical
BGS-entropy. This has resulted in a focus on $\lambda_0$ which corresponds to the logarithm
of the partition function in the classical case (so, for the classical case, we can write
$\lambda_0 = \ln Z(\lambda)$ where $Z(\lambda) = \sum_{i \in A} \exp(-\lambda \cdot f(i))$). It is well known that $\ln Z$ is a key quantity
to work with, thus this feature should be no great surprise. But it is interesting that our
approach leads directly to this quantity. As the partition function has no place for the
general case covered by Theorem 2, this is of course also forced in some sense.

Exponential families. Whereas the concept of partition function does not survive 
the extension to general entropy- and complexity measures, the notion of exponential 
families does. It even appears to be the central concept behind the approach taken, cf. 
Theorem 1. However, extensions of this concept are needed (see below). 

Comparing with the classical approach. The simplifications in the classical case result
from the factorization property of $\kappa^{-1}$, an exponential function in that case. Apart from
this, the calculations for a general complexity function appear to be of much the same
nature as for the classical case. Indeed, given $\lambda = (\lambda_1, \dots, \lambda_k)$ one determines $\lambda_0$ from
(5) and then, via (6), (7) leads to the relevant averages $a = (a_1, \dots, a_k)$. If you aim for a
specific set of averages, there seems to be no way, neither in the classical case nor in the
general setting, other than application of numerical optimization procedures to choose
just that set of parameters $\lambda$ which leads to the appropriate set of constrained values.
This discussion then tells us that apart from the simplifications possible in handling (5),
the general calculus suggested is no more complicated in practice than what you are used
to from classical studies.

Thermodynamic calculus. The difficulties, indeed impossibilities, involved in finding 
solutions to MaxEnt problems in closed form for other than the simplest problems 
constitute part of the motivation to create a thermodynamic calculus, studying variation 
as functions of various parameters of significance to the physicist or chemist. In this 
way one hopes to develop useful approximate solutions or to discover interesting trends 
in the thermodynamics as response to changes of relevant parameters. The differential 
calculus needed for such endeavours appears to be applicable also to the general setting 
of Theorem 2 with its precise equations to look closer into. Studies of this kind are not 
taken up here. 

Natural expansions, optimal updating based on a prior. There are many further
possibilities for theoretical investigations based on measures of complexity of the form
here studied. Assumptions related to the form (2) allow one to derive several results
other than Theorem 2: uniqueness of $Q$ determined from $\lambda$, convexity of the set of $\lambda$'s
for which $Q$ can be found, convexity of the function $\lambda \mapsto \lambda_0 = \lambda_0(\lambda)$ (this corresponds
in the classical case to log-convexity of the partition function), existence of equilibria
for the models in the natural family and, as a consequence, concavity of the map
$a \mapsto \mathrm{MaxEnt}(\Phi, \mathcal{P}_a)$.

We comment that whereas measures of complexity of the special form (8) are rather
simple and form quite a rich family, the more elaborate form given by (2) is also of importance
- especially, it allows the consideration of Rényi entropies and related quantities.

A special expansion of the concept of robustness which allows identification of
MaxEnt-distributions for which some of the point probabilities (the $q_i$ of Theorem 2) are
allowed to be $0$ should also be mentioned. This concerns cases where $\lambda_0 + \lambda \cdot f(i) > \kappa(0)$
and is therefore only relevant when $\kappa(0) < \infty$. However, there are important cases where
this is so, e.g. Tsallis-type quantities with $q > 1$. In such cases inconsistent inference is
possible where a feasible $i$ (one for which there exists $P \in \mathcal{P}_a$ with $p_i > 0$) is inferred
under MaxEnt-based inference as an impossible event. This phenomenon is treated in
part by Jaynes, cf. p. 345 of [22]. Taking this into consideration, it appears possible
to prove that any candidate to MaxEnt-distributions (or the more general centers of
attraction of [15]) of preparations in a natural family of preparations, must be a member
of the associated exponential family. For the classical case, where inconsistent inference
is not possible, such a result was established in [15].

Consider now the problem of optimal updating based on a given prior. In fact,
such problems can be handled in analogy with our analysis of MaxEnt problems. In
particular, a result a la Theorem 2 holds which provides a calculus for optimal posterior
distributions via a minimum cross entropy principle - the kind of results initiated by
Kullback, cf. [24]. To indicate, if only briefly, that this requires no new techniques,
consider a prior $Q_0$ and try to maximize the updating gain $(P, Q) \mapsto \Phi(P, Q_0) - \Phi(P, Q)$.
This situation can be analyzed by applying our game theoretical reasoning to
the map $(P, Q) \mapsto \Phi(P, Q) - \Phi(P, Q_0)$, which is a genuine complexity measure. For this to work,
the theory has to be extended slightly, allowing complexity measures that can take negative values.

Precise statements and proofs of results just indicated will be published elsewhere.

Origin of the two-parameter family. The two-parameter family of complexity-,
entropy- and divergence measures, $(\Phi_{\alpha,\beta}, H_{\alpha,\beta}, D_{\alpha,\beta})$, has its origin in the mathematical
literature, cf. Mittal [8] and Sharma and Taneja [9], and was studied later in the
physical literature by Borges and Roditi [10], who used the convenient concept of
deformed logarithms.

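Entropy should not stand alone. Let us illustrate this thesis by considering Tsallis
entropy with Tsallis parameter $q$. There are infinitely many ways of obtaining this
entropy measure as minimal complexity. Below we suggest three complexity measures
which have this property:

$\Phi^B(P, Q) = \sum \left( q_i^q + \frac{1}{1-q}\, p_i \left( q\, q_i^{q-1} - 1 \right) \right)$,   (22)
$\Phi^C(P, Q) = \frac{1}{1-q} \sum p_i^q \left( 1 - q_i^{1-q} \right)$,   (23)
$\Phi^R(P, Q) = \frac{1}{1-q} \left( \frac{\sum p_i^q}{\sum p_i^q q_i^{1-q}} - 1 \right)$.   (24)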

As usual, sums are over $i \in A$. The "B", "C" and "R" stand for, respectively,
"Bregman", "Csiszár" and "Rényi". The complexity measure $\Phi^B$ is the one considered
in the main text, $\Phi^C$ the one considered in [4] and $\Phi^R$ is closely related to the relevant
complexity measure connected with Rényi entropy and divergence.

The measure $\Phi^B$ allows us - as we have seen - to study the natural preparations
given by linear constraints, $\Phi^C$ allows us to develop a calculus much like Theorem 2, but
aiming at maximizing entropy for preparations given by averaging with respect to the
$q$-associated measures, which are measures with point masses $p_i^q$, and finally, $\Phi^R$ allows
us to deal with preparations given by averages with respect to the $q$-escort distributions,
which are obtained by normalizing the $q$-associated measures. To realize that this is
indeed so, you just have to note how $P$ enters in the complexity measure considered.
It can safely be argued that "distorted" averages such as those indicated above related to
$\Phi^C$ and $\Phi^R$ have no physical relevance and therefore, they are considered of less or no
importance for the study of natural maximum entropy problems. Bregman generation is
thus the method which remains as the really significant one.

The importance of Bregman type quantities. The relevance for statistical physics of
Bregman divergence was emphasized by Naudts [1], [2]. The work by Abe and Bagci
[3] should also be mentioned. However, the present author does not agree with their
conclusion that the use of escort distributions is essential. Anyhow, the proper matching
of entropy measure with the type of constraints one wants to study is important. This
issue is also addressed in Feng [20].

Originally, Bregman introduced the concept to meet needs of learning theory, cf. [21]. 
For more recent articles in this direction, see Murata et al., [19] and Sears [18]. 

Concerning extensions in another direction, to quantum statistical physics, note the 
recent study by Petz, [17] where Bregman divergences are carefully defined. Incorpora- 
tion of game theoretical considerations may be a fruitful area of research to look into. 

Interpretations. Any measure of entropy of importance to statistical physics should
be motivated by sound reasons, including appropriate interpretations. It appears that
Bregman generation in itself goes some way in this direction. In addition, the choice of
terminology, especially regarding the frequent reference to "coding", though not yet
founded in precise procedures for observation or measurement, is indicative of what
future research may bring; at least this is where the speculations of the author go.

One should recall that Kullback-Leibler divergence is related to free energy for clas- 
sical preparations. This kind of interpretation when more general Bregman-type diver- 
gences are involved appears also to be sound, cf. the recent study by Bagci, [16]. Possi- 
bly, Crooks, [25], also points to issues to be integrated before a full picture is in place. 